349 research outputs found

    Biology of Genomes: making sense of sequence

    A report on the Biology of Genomes meeting held at Cold Spring Harbor Laboratory, NY, USA, 5-9 May 2009.

    The promise and reality of personal genomics

    The second personal genome sequence of a Korean tells something about genetic ancestry but still little of medical relevance.

    Association of a Low-Frequency Variant in HNF1A With Type 2 Diabetes in a Latino Population

    Importance: Latino populations have one of the highest prevalences of type 2 diabetes worldwide. Objectives: To investigate the association between rare protein-coding genetic variants and prevalence of type 2 diabetes in a large Latino population and to explore potential molecular and physiological mechanisms for the observed relationships. Design, Setting, and Participants: Whole-exome sequencing was performed on DNA samples from 3756 Mexican and US Latino individuals (1794 with type 2 diabetes and 1962 without diabetes) recruited from 1993 to 2013. One variant was further tested for allele frequency and association with type 2 diabetes in large multiethnic data sets of 14 276 participants and characterized in experimental assays. Main Outcome and Measures: Prevalence of type 2 diabetes. Secondary outcomes included age of onset, body mass index, and effect on protein function. Results: A single rare missense variant (c.1522G>A [p.E508K]) was associated with type 2 diabetes prevalence (odds ratio [OR], 5.48; 95% CI, 2.83-10.61; P = 4.4 × 10⁻⁷) in hepatocyte nuclear factor 1-α (HNF1A), the gene responsible for maturity onset diabetes of the young type 3 (MODY3). This variant was observed in 0.36% of participants without type 2 diabetes and 2.1% of participants with it. In multiethnic replication data sets, the p.E508K variant was seen only in Latino patients (n = 1443 with type 2 diabetes and 1673 without it) and was associated with type 2 diabetes (OR, 4.16; 95% CI, 1.75-9.92; P = .0013). In experimental assays, HNF-1A protein encoding the p.E508K mutant demonstrated reduced transactivation activity of its target promoter compared with a wild-type protein. In our data, carriers and noncarriers of the p.E508K mutation with type 2 diabetes had no significant differences in compared clinical characteristics, including age at onset. The mean (SD) age for carriers was 45.3 years (11.2) vs 47.5 years (11.5) for noncarriers (P = .49) and the mean (SD) BMI for carriers was 28.2 (5.5) vs 29.3 (5.3) for noncarriers (P = .19). Conclusions and Relevance: Using whole-exome sequencing, we identified a single low-frequency variant in the MODY3-causing gene HNF1A that is associated with type 2 diabetes in Latino populations and may affect protein function. This finding may have implications for screening and therapeutic modification in this population, but additional studies are required.
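    The discovery-cohort statistics quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis. It assumes the 95% CI is a Wald interval on the log-odds scale, which the abstract does not state, and all variable names are hypothetical. It computes a crude odds ratio from the two rounded carrier percentages, then recovers an approximate standard error and two-sided P value from the reported OR and CI.

```python
import math

# Values quoted in the abstract (discovery cohort).
reported_or = 5.48
ci_low, ci_high = 2.83, 10.61
carrier_frac_cases = 0.021      # 2.1% of participants with type 2 diabetes
carrier_frac_controls = 0.0036  # 0.36% of participants without it


def odds(p: float) -> float:
    """Convert a proportion into odds."""
    return p / (1.0 - p)


# Crude (unadjusted) odds ratio from the rounded carrier percentages.
crude_or = odds(carrier_frac_cases) / odds(carrier_frac_controls)

# ASSUMPTION: the CI is a Wald interval, symmetric on the log scale.
# Under that assumption the geometric mean of the bounds is the point estimate.
ci_midpoint = math.sqrt(ci_low * ci_high)

# Standard error of log(OR) implied by the CI width, then a z statistic
# and a two-sided normal P value.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = math.log(reported_or) / se_log_or
p_two_sided = math.erfc(z / math.sqrt(2))

print(f"crude OR from carrier fractions: {crude_or:.2f}")
print(f"geometric mean of CI bounds:     {ci_midpoint:.2f}")
print(f"implied z = {z:.2f}, two-sided P = {p_two_sided:.1e}")
```

    Under these assumptions the geometric mean of the CI bounds matches the reported OR of 5.48, and the implied P value is of the same order as the reported 4.4 × 10⁻⁷. The crude odds ratio comes out somewhat higher, at roughly 5.9. The abstract does not say how the reported estimate was obtained, so the gap may reflect rounding of the percentages, the statistical model used, or both.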

    Low-Surface-Brightness Galaxies in the Sloan Digital Sky Survey. I. Search Method and Test Sample

    In this paper we present results of a pilot study to use imaging data from the Sloan Digital Sky Survey (SDSS) to search for low-surface-brightness (LSB) galaxies. For our pilot study we use a test sample of 92 galaxies from the catalog of Impey et al. (1996) distributed over 93 SDSS fields of the Early Data Release (EDR). Many galaxies from the test sample are either LSB or dwarf galaxies. To deal with the SDSS data most effectively, new photometry software was created, which is described in this paper. We present the results of the selection algorithms applied to these 93 EDR fields. Two galaxies from the Impey et al. test sample are very likely artifacts, as confirmed by follow-up imaging. With our algorithms, we were able to recover 87 of the 90 remaining test sample galaxies, implying a detection rate of ~96.5%. The three missed galaxies fall too close to very bright stars or galaxies. In addition, 42 new galaxies with parameters similar to the test sample objects were found in these EDR fields (i.e., ~47% additional galaxies). We present the main photometric parameters of all identified galaxies and carry out first statistical comparisons. We tested the quality of our photometry by comparing the magnitudes for our test sample galaxies and other bright galaxies with values from the literature. All these tests yielded consistent results. We briefly discuss a few unusual galaxies found in our pilot study, including an LSB galaxy with a two-component disk and ten new giant LSB galaxies. Comment: 36 pages, 16 figures, accepted for publication by AJ; some figures were bitmapped to reduce the size.
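    The counts in this abstract can be turned into rates directly. The snippet below is a small worked check and nothing more; the variable names are hypothetical and only the counts come from the abstract.

```python
# Counts quoted in the abstract.
test_sample = 92      # galaxies taken from the Impey et al. (1996) catalog
artifacts = 2         # test-sample entries judged to be likely artifacts
recovered = 87        # test-sample galaxies recovered by the algorithms
new_galaxies = 42     # additional galaxies found in the same EDR fields

# Galaxies that the algorithms could in principle have recovered.
remaining = test_sample - artifacts
missed = remaining - recovered

detection_rate = recovered / remaining
new_fraction = new_galaxies / remaining

print(f"remaining test galaxies: {remaining}, missed: {missed}")
print(f"detection rate: {detection_rate:.1%}")
print(f"new galaxies relative to test sample: {new_fraction:.1%}")
```

    The detection rate works out to just under 97%, in line with the approximate figure quoted, and the 42 new galaxies correspond to about 47% of the 90 remaining test galaxies.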

    Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects

    As new proposals aim to sequence ever-larger collections of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not yet been observed in current cohorts. Our results quantify the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have a loss-of-function frequency of <0.00001 in healthy humans, consistent with very strong intolerance to gene inactivation. United States. National Institutes of Health (U54DK105566); United States. National Institutes of Health (R01GM104371)

    High altitude adaptation in Daghestani populations from the Caucasus

    We have surveyed 15 high-altitude adaptation candidate genes for signals of positive selection in North Caucasian highlanders using targeted re-sequencing. A total of 49 unrelated Daghestani from three ethnic groups (Avars, Kubachians, and Laks) living in ancient villages located at around 2,000 m above sea level were chosen as the study population. Caucasian (Adygei living at sea level, N = 20) and CEU (CEPH Utah residents with ancestry from northern and western Europe; N = 20) were used as controls. Candidate genes were compared with 20 putatively neutral control regions resequenced in the same individuals. The regions of interest were amplified by long-PCR, pooled according to individual, indexed by adding an eight-nucleotide tag, and sequenced using the Illumina GAII platform. A total of 1,066 SNPs were called using false discovery and false negative thresholds of ~6%. The neutral regions provided an empirical null distribution to compare with the candidate genes for signals of selection. Two genes stood out. In Laks, a non-synonymous variant within HIF1A already known to be associated with improvement in oxygen metabolism was rediscovered, and in Kubachians a cluster of 13 SNPs located in a conserved intronic region within EGLN1 showing high population differentiation was found. These variants illustrate both the common pathways of adaptation to high altitude in different populations and features specific to the Daghestani populations, showing how even a mildly hypoxic environment can lead to genetic adaptation.

    Base-specific mutational intolerance near splice sites clarifies the role of nonessential splice nucleotides

    Variation in RNA splicing (i.e., alternative splicing) plays an important role in many diseases. Variants near 5' and 3' splice sites often affect splicing, but the effects of these variants on splicing and disease have not been fully characterized beyond the two "essential" splice nucleotides flanking each exon. Here we provide quantitative measurements of tolerance to mutational disruptions by position and reference allele-alternative allele combinations. We show that certain reference alleles are particularly sensitive to mutations, regardless of the alternative alleles into which they are mutated. Using public RNA-seq data, we demonstrate that individuals carrying such variants have significantly lower levels of the correctly spliced transcript, compared to individuals without them, and confirm that these specific substitutions are highly enriched for known Mendelian mutations. Our results propose a more refined definition of the "splice region" and offer a new way to prioritize and provide functional interpretation of variants identified in diagnostic sequencing and association studies.

    Biallelic Variants in TTLL5, Encoding a Tubulin Glutamylase, Cause Retinal Dystrophy

    In a subset of inherited retinal degenerations (including cone, cone-rod, and macular dystrophies), cone photoreceptors are more severely affected than rods; ABCA4 mutations are the most common cause of this heterogeneous class of disorders. To identify retinal-disease-associated genes, we performed exome sequencing in 28 individuals with “cone-first” retinal disease and clinical features atypical for ABCA4 retinopathy. We then conducted a gene-based case-control association study with an internal exome data set as the control group. TTLL5, encoding a tubulin glutamylase, was highlighted as the most likely disease-associated gene; 2 of 28 affected subjects harbored presumed loss-of-function variants: c.[1586_1589delAGAG];[1586_1589delAGAG], p.[Glu529Valfs∗2];[Glu529Valfs∗2], and c.[401delT(;)3354G>A], p.[Leu134Argfs∗45(;)Trp1118∗]. We then inspected previously collected exome sequence data from individuals with related phenotypes and found two siblings with homozygous nonsense variant c.1627G>T (p.Glu543∗) in TTLL5. Subsequently, we tested a panel of 55 probands with retinal dystrophy for TTLL5 mutations; one proband had a homozygous missense change (c.1627G>A [p.Glu543Lys]). The retinal phenotype was highly similar in three of four families; the sibling pair had a more severe, early-onset disease. In human and murine retinae, TTLL5 localized to the centrioles at the base of the connecting cilium. TTLL5 has been previously reported to be essential for the correct function of sperm flagella in mice and play a role in polyglutamylation of primary cilia in vitro. Notably, genes involved in the polyglutamylation and deglutamylation of tubulin have been associated with photoreceptor degeneration in mice. The electrophysiological and fundus autofluorescence imaging presented here should facilitate the molecular diagnosis in further families.

    Genetic regulatory variation in populations informs transcriptome analysis in rare disease

    Transcriptome data can facilitate the interpretation of the effects of rare genetic variants. Here, we introduce ANEVA (analysis of expression variation) to quantify genetic variation in gene dosage from allelic expression (AE) data in a population. Application of ANEVA to the Genotype-Tissue Expression (GTEx) data showed that this variance estimate is robust and correlated with selective constraint in a gene. Using these variance estimates in a dosage outlier test (ANEVA-DOT) applied to AE data from 70 Mendelian muscular disease patients showed accuracy in detecting genes with pathogenic variants in previously resolved cases and led to one confirmed and several potential new diagnoses. Using our reference estimates from GTEx data, ANEVA-DOT can be incorporated in rare disease diagnostic pipelines to use RNA-sequencing data more effectively.